Reading and manipulating data with Pandas

Author: Roberto Muñoz
E-mail: rmunoz@uc.cl

This notebook shows how to create Series and DataFrame objects with Pandas, how to read CSV files, and how to create pivot tables. The first part is based on Chapter 3 of the Python Data Science Handbook.


In [33]:
# __future__ imports must come before any other statement
from __future__ import print_function

import numpy as np

In [34]:
import pandas as pd
pd.__version__


Out[34]:
'0.19.2'

1. The Pandas Series Object

A Pandas Series is a one-dimensional array of indexed data. It can be created from a list or array as follows:


In [3]:
data = pd.Series([0.25, 0.5, 0.75, 1.0])
data


Out[3]:
0    0.25
1    0.50
2    0.75
3    1.00
dtype: float64

As we see in the output, the Series wraps both a sequence of values and a sequence of indices, which we can access with the values and index attributes. The values are simply a familiar NumPy array:


In [4]:
data.values


Out[4]:
array([ 0.25,  0.5 ,  0.75,  1.  ])

The index is an array-like object of type pd.Index:


In [5]:
data.index


Out[5]:
RangeIndex(start=0, stop=4, step=1)

Like with a NumPy array, data can be accessed by the associated index via the familiar Python square-bracket notation:


In [6]:
data[1]


Out[6]:
0.5
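Square brackets also accept slices, again as on a NumPy array. The following is a minimal sketch on the same four values; the name `middle` is only illustrative.

```python
import pandas as pd

data = pd.Series([0.25, 0.5, 0.75, 1.0])

# An integer slice selects by position, as on a NumPy array,
# and the result keeps the original index labels (1 and 2 here).
middle = data[1:3]
print(middle)
```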

Series as generalized NumPy array

From what we've seen so far, it may look like the Series object is basically interchangeable with a one-dimensional NumPy array. The essential difference is the presence of the index: while the NumPy array has an implicitly defined integer index used to access the values, the Pandas Series has an explicitly defined index associated with the values. This explicit index need not be an integer; for example, we can use strings as the index:


In [7]:
data = pd.Series([0.25, 0.5, 0.75, 1.0],
                 index=['a', 'b', 'c', 'd'])
data


Out[7]:
a    0.25
b    0.50
c    0.75
d    1.00
dtype: float64

And the item access works as expected:


In [8]:
data['b']


Out[8]:
0.5
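The explicit index does not have to be contiguous or ordered either. As a small sketch, with arbitrarily chosen integer labels, the lookup below goes by label rather than by position:

```python
import pandas as pd

# Illustrative labels: non-contiguous, non-sequential integers
data = pd.Series([0.25, 0.5, 0.75, 1.0], index=[2, 5, 3, 7])

# With an integer index, data[5] looks up the label 5, not position 5
by_brackets = data[5]
# .loc makes the label-based intent explicit
by_loc = data.loc[5]
print(by_brackets, by_loc)
```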

Series as specialized dictionary

In this way, you can think of a Pandas Series a bit like a specialization of a Python dictionary. A dictionary is a structure that maps arbitrary keys to a set of arbitrary values, and a Series is a structure which maps typed keys to a set of typed values. This typing is important: just as the type-specific compiled code behind a NumPy array makes it more efficient than a Python list for certain operations, the type information of a Pandas Series makes it much more efficient than Python dictionaries for certain operations.


In [9]:
population_dict = {'Arica y Parinacota': 243149,
                   'Antofagasta': 631875,
                   'Metropolitana de Santiago': 7399042,
                   'Valparaiso': 1842880,
                   'Bíobío': 2127902,
                   'Magallanes y Antártica Chilena': 165547}
population = pd.Series(population_dict)
population


Out[9]:
Antofagasta                        631875
Arica y Parinacota                 243149
Bíobío                            2127902
Magallanes y Antártica Chilena     165547
Metropolitana de Santiago         7399042
Valparaiso                        1842880
dtype: int64

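Notice that the index was sorted lexicographically. This is what the Pandas version used here (0.19.2) does when building a Series from a dictionary; Pandas 0.23 and later, running on Python 3.6+, keep the dictionary's insertion order instead.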
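If a lexicographically ordered index matters for later steps, it can be requested explicitly instead of relying on what the constructor does. A minimal sketch, using a shortened copy of the dictionary:

```python
import pandas as pd

# Shortened copy of the population dictionary, deliberately out of order
population_dict = {'Valparaiso': 1842880,
                   'Antofagasta': 631875,
                   'Arica y Parinacota': 243149}

# sort_index() returns a new Series ordered by its index labels
population = pd.Series(population_dict).sort_index()
print(population)
```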


In [10]:
population['Arica y Parinacota']


Out[10]:
243149

Unlike a dictionary, though, the Series also supports array-style operations such as slicing:


In [11]:
population['Metropolitana de Santiago':'Valparaiso']


Out[11]:
Metropolitana de Santiago    7399042
Valparaiso                   1842880
dtype: int64
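Note that a label-based slice includes its end label, whereas a positional slice excludes its stop position. The sketch below contrasts the two on a three-entry Series; `by_label` and `by_position` are illustrative names.

```python
import pandas as pd

population = pd.Series({'Antofagasta': 631875,
                        'Arica y Parinacota': 243149,
                        'Valparaiso': 1842880})

# Label slice: both 'Antofagasta' and 'Arica y Parinacota' are returned
by_label = population['Antofagasta':'Arica y Parinacota']

# Positional slice: position 1 is excluded, so only one row is returned
by_position = population.iloc[0:1]

print(len(by_label), len(by_position))
```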

2. The Pandas DataFrame Object

The next fundamental structure in Pandas is the DataFrame. Like the Series object discussed in the previous section, the DataFrame can be thought of either as a generalization of a NumPy array, or as a specialization of a Python dictionary. We'll now take a look at each of these perspectives.

DataFrame as a generalized NumPy array

If a Series is an analog of a one-dimensional array with flexible indices, a DataFrame is an analog of a two-dimensional array with both flexible row indices and flexible column names.


In [12]:
# Area in km^2
area_dict = {'Arica y Parinacota': 16873.3,
             'Antofagasta': 126049.1,
             'Metropolitana de Santiago': 15403.2,
             'Valparaiso': 16396.1,
             'Bíobío': 37068.7,
             'Magallanes y Antártica Chilena': 1382291.1}
area = pd.Series(area_dict)
area


Out[12]:
Antofagasta                        126049.1
Arica y Parinacota                  16873.3
Bíobío                              37068.7
Magallanes y Antártica Chilena    1382291.1
Metropolitana de Santiago           15403.2
Valparaiso                          16396.1
dtype: float64

Now that we have this along with the population Series from before, we can use a dictionary to construct a single two-dimensional object containing this information:


In [13]:
regions = pd.DataFrame({'population': population,
                       'area': area})
regions


Out[13]:
                                     area  population
Antofagasta                      126049.1      631875
Arica y Parinacota                16873.3      243149
Bíobío                            37068.7     2127902
Magallanes y Antártica Chilena  1382291.1      165547
Metropolitana de Santiago         15403.2     7399042
Valparaiso                        16396.1     1842880

In [14]:
regions.index


Out[14]:
Index(['Antofagasta', 'Arica y Parinacota', 'Bíobío',
       'Magallanes y Antártica Chilena', 'Metropolitana de Santiago',
       'Valparaiso'],
      dtype='object')

In [15]:
regions.columns


Out[15]:
Index(['area', 'population'], dtype='object')

DataFrame as specialized dictionary

Similarly, we can also think of a DataFrame as a specialization of a dictionary. Where a dictionary maps a key to a value, a DataFrame maps a column name to a Series of column data. For example, indexing with the 'area' key returns the Series object containing the areas we saw earlier:


In [16]:
regions['area']


Out[16]:
Antofagasta                        126049.1
Arica y Parinacota                  16873.3
Bíobío                              37068.7
Magallanes y Antártica Chilena    1382291.1
Metropolitana de Santiago           15403.2
Valparaiso                          16396.1
Name: area, dtype: float64
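The dictionary analogy also covers assignment: setting a new key adds a column. The sketch below derives a population density on a two-row excerpt of the data; the column name `density` is an assumption made for illustration.

```python
import pandas as pd

# Two-row excerpt of the regions table built earlier
regions = pd.DataFrame({
    'population': pd.Series({'Antofagasta': 631875, 'Valparaiso': 1842880}),
    'area': pd.Series({'Antofagasta': 126049.1, 'Valparaiso': 16396.1}),
})

# Dictionary-style assignment creates the column; the division is
# element-wise and aligned on the row index. Units: inhabitants per km^2.
regions['density'] = regions['population'] / regions['area']
print(regions)
```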

Constructing DataFrame objects

A Pandas DataFrame can be constructed in a variety of ways. Here we'll give several examples.

From a single Series object

A DataFrame is a collection of Series objects, and a single-column DataFrame can be constructed from a single Series:


In [17]:
pd.DataFrame(population, columns=['population'])


Out[17]:
                                population
Antofagasta                         631875
Arica y Parinacota                  243149
Bíobío                             2127902
Magallanes y Antártica Chilena      165547
Metropolitana de Santiago          7399042
Valparaiso                         1842880

From a dictionary of Series objects

As we saw before, a DataFrame can be constructed from a dictionary of Series objects as well:


In [18]:
pd.DataFrame({'population': population,
              'area': area}, columns=['population', 'area'])


Out[18]:
                                population       area
Antofagasta                         631875   126049.1
Arica y Parinacota                  243149    16873.3
Bíobío                             2127902    37068.7
Magallanes y Antártica Chilena      165547  1382291.1
Metropolitana de Santiago          7399042    15403.2
Valparaiso                         1842880    16396.1
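Another route is to start from a two-dimensional NumPy array and supply the row and column labels separately. This is a sketch on a two-row excerpt of the same figures, not a cell from the original notebook.

```python
import numpy as np
import pandas as pd

# Rows are regions, columns are (population, area in km^2).
# Mixing integers and floats in one array makes every value a float.
values = np.array([[631875, 126049.1],
                   [1842880, 16396.1]])

df = pd.DataFrame(values,
                  columns=['population', 'area'],
                  index=['Antofagasta', 'Valparaiso'])
print(df)
print(df.dtypes)
```

Because a NumPy array has a single dtype, the population column ends up as float64 here; building from a dictionary of Series keeps each column's own dtype.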

3. Reading a CSV file and doing common Pandas operations


In [19]:
regiones_file = 'data/chile_regiones.csv'
provincias_file = 'data/chile_provincias.csv'
comunas_file = 'data/chile_comunas.csv'

# The first row of each file holds the column names; fields are comma-separated
regiones = pd.read_csv(regiones_file, header=0, sep=',')
provincias = pd.read_csv(provincias_file, header=0, sep=',')
comunas = pd.read_csv(comunas_file, header=0, sep=',')

In [20]:
print('regiones table: ', regiones.columns.values.tolist())
print('provincias table: ', provincias.columns.values.tolist())
print('comunas table: ', comunas.columns.values.tolist())


regiones table:  ['RegionID', 'RegionNombre', 'RegionOrdinal']
provincias table:  ['ProvinciaID', 'ProvinciaNombre', 'RegionID']
comunas table:  ['ComunaID', 'ComunaNombre', 'ProvinciaID']
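The CSV files themselves are not included here, so the following sketch reads a small in-memory stand-in with the same three columns as the regiones table. The two data rows and the name `regiones_demo` are illustrative assumptions, not the contents of the real file.

```python
import io
import pandas as pd

# Hypothetical stand-in for data/chile_regiones.csv
csv_text = u"""RegionID,RegionNombre,RegionOrdinal
1,Arica y Parinacota,XV
2,Tarapacá,I
"""

# read_csv accepts any file-like object, not only a path
regiones_demo = pd.read_csv(io.StringIO(csv_text), header=0, sep=',')

print(regiones_demo.columns.tolist())
print(regiones_demo.dtypes)   # RegionID is parsed as an integer column
```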

In [21]:
regiones.head()


Out[21]:
   RegionID          RegionNombre  RegionOrdinal
0         1  'Arica y Parinacota'           'XV'
1         2            'Tarapacá'            'I'
2         3         'Antofagasta'           'II'
3         4             'Atacama'          'III'
4         5            'Coquimbo'           'IV'

In [22]:
provincias.head()


Out[22]:
   ProvinciaID  ProvinciaNombre  RegionID
0            1          'Arica'         1
1            2     'Parinacota'         1
2            3        'Iquique'         2
3            4   'El Tamarugal'         2
4            5    'Antofagasta'         3

In [23]:
comunas.head()


Out[23]:
   ComunaID     ComunaNombre  ProvinciaID
0         1          'Arica'            1
1         2      'Camarones'            1
2         3  'General Lagos'            2
3         4          'Putre'            2
4         5  'Alto Hospicio'            3

In [24]:
# Outer join on the only column the two tables share, RegionID
regiones_provincias = pd.merge(regiones, provincias, how='outer')
regiones_provincias.head()


Out[24]:
   RegionID          RegionNombre  RegionOrdinal  ProvinciaID  ProvinciaNombre
0         1  'Arica y Parinacota'           'XV'            1          'Arica'
1         1  'Arica y Parinacota'           'XV'            2     'Parinacota'
2         2            'Tarapacá'            'I'            3        'Iquique'
3         2            'Tarapacá'            'I'            4   'El Tamarugal'
4         3         'Antofagasta'           'II'            5    'Antofagasta'
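When `on` is not given, `pd.merge` joins on every column name the two tables have in common, which here is only RegionID. A hedged sketch of the same join with the key spelled out, on hypothetical three-row toy tables, is shown below; `indicator=True` adds a `_merge` column that reports where each row came from.

```python
import pandas as pd

# Toy tables with the same column names as the CSV files (values are illustrative)
regiones = pd.DataFrame({'RegionID': [1, 2],
                         'RegionNombre': ['Arica y Parinacota', 'Tarapacá']})
provincias = pd.DataFrame({'ProvinciaID': [1, 2, 3],
                           'ProvinciaNombre': ['Arica', 'Parinacota', 'Iquique'],
                           'RegionID': [1, 1, 2]})

# Naming the key documents the intent and guards against accidental
# joins on other columns that happen to share a name.
merged = pd.merge(regiones, provincias, on='RegionID', how='outer',
                  indicator=True)
print(merged)
```

With an outer join, rows without a partner would show up as `left_only` or `right_only` in `_merge`, which is a quick way to spot unmatched keys.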

In [25]:
# Outer join on the only column the two tables share, ProvinciaID
provincias_comunas = pd.merge(provincias, comunas, how='outer')
provincias_comunas.head()


Out[25]:
   ProvinciaID  ProvinciaNombre  RegionID  ComunaID     ComunaNombre
0            1          'Arica'         1         1          'Arica'
1            1          'Arica'         1         2      'Camarones'
2            2     'Parinacota'         1         3  'General Lagos'
3            2     'Parinacota'         1         4          'Putre'
4            3        'Iquique'         2         5  'Alto Hospicio'

In [26]:
# Outer join on the shared column, ProvinciaID, then name the row index
regiones_provincias_comunas = pd.merge(regiones_provincias, comunas, how='outer')
regiones_provincias_comunas.index.name = 'ID'
regiones_provincias_comunas.head()


Out[26]:
    RegionID          RegionNombre  RegionOrdinal  ProvinciaID  ProvinciaNombre  ComunaID     ComunaNombre
ID
0          1  'Arica y Parinacota'           'XV'            1          'Arica'         1          'Arica'
1          1  'Arica y Parinacota'           'XV'            1          'Arica'         2      'Camarones'
2          1  'Arica y Parinacota'           'XV'            2     'Parinacota'         3  'General Lagos'
3          1  'Arica y Parinacota'           'XV'            2     'Parinacota'         4          'Putre'
4          2            'Tarapacá'            'I'            3        'Iquique'         5  'Alto Hospicio'

In [27]:
#regiones_provincias_comunas.to_csv('chile_regiones_provincia_comuna.csv', index=False)

4. Loading the full dataset


In [35]:
data_file = 'data/chile_demographic.csv'
data = pd.read_csv(data_file, header=0, sep=',')
data


Out[35]:
RegionID Region Provincia Comuna Superficie Poblacion Densidad IDH_2005
0 1 Arica y Parinacota Arica Arica 4799.4 210936 38.4 0.736
1 1 Arica y Parinacota Arica Camarones 3927.0 679 0.3 0.751
2 1 Arica y Parinacota Parinacota General Lagos 2244.4 739 0.5 0.670
3 1 Arica y Parinacota Parinacota Putre 5902.5 1462 0.2 0.707
4 1 Arica y Parinacota Iquique Alto Hospicio 572.9 94455 87.6 NaN
5 2 Tarapacá Tamarugal Camiña 2200.2 1156 0.5 0.619
6 2 Tarapacá Tamarugal Colchane 4015.6 1384 0.4 0.603
7 2 Tarapacá Tamarugal Huara 10474.6 2360 0.2 0.676
8 2 Tarapacá Iquique Iquique 2242.1 184953 82.4 0.766
9 2 Tarapacá Tamarugal Pica 8934.3 4194 0.6 0.793
10 2 Tarapacá Tamarugal Pozo Almonte 13765.8 11519 0.7 0.722
11 3 Antofagasta Antofagasta Antofagasta 30718.1 348669 9.7 0.734
12 3 Antofagasta El Loa Calama 15596.9 147886 9.1 0.757
13 3 Antofagasta Tocopilla María Elena 12197.0 4593 0.6 0.779
14 3 Antofagasta Antofagasta Mejillones 3803.9 9752 2.2 0.730
15 3 Antofagasta El Loa Ollagüe 2964.0 332 0.1 0.679
16 3 Antofagasta El Loa San Pedro de Atacama 23439.0 5605 0.2 0.711
17 3 Antofagasta Antofagasta Sierra Gorda 12886.0 1206 0.1 0.789
18 3 Antofagasta Antofagasta Taltal 20405.1 13493 0.5 0.716
19 3 Antofagasta Tocopilla Tocopilla 4038.8 20091 5.9 0.690
20 4 Atacama Huasco Alto del Carmen 5939.0 5488 0.8 0.664
21 4 Atacama Copiapó Caldera 4666.6 16150 2.9 0.741
22 4 Atacama Chañaral Chañaral 5772.0 14146 2.2 0.714
23 4 Atacama Copiapó Copiapó 16681.3 158261 11.9 0.725
24 4 Atacama Chañaral Diego de Almagro 18664.0 16452 1.0 0.789
25 4 Atacama Huasco Freirina 3207.0 6531 1.7 0.693
26 4 Atacama Huasco Huasco 1601.4 9015 5.6 0.695
27 4 Atacama Copiapó Tierra Amarilla 11191.0 13912 1.1 0.686
28 4 Atacama Huasco Vallenar 7084.0 52099 6.7 0.731
29 5 Coquimbo Elqui Andacollo 310.0 11116 33.1 0.675
... ... ... ... ... ... ... ... ...
316 13 Los Lagos Osorno Purranque 1458.8 20768 14.2 0.627
317 13 Los Lagos Osorno Puyehue 1597.9 11370 7.1 0.675
318 13 Los Lagos Chiloé Queilén 332.9 5319 16.0 0.646
319 13 Los Lagos Chiloé Quellón 3244.0 30964 9.5 0.670
320 13 Los Lagos Chiloé Quemchi 440.3 9191 20.9 0.656
321 13 Los Lagos Chiloé Quinchao 160.7 9043 56.3 0.648
322 13 Los Lagos Osorno Río Negro 1265.7 13425 10.6 0.633
323 13 Los Lagos Osorno San Juan de la Costa 1517.0 7997 5.3 0.510
324 13 Los Lagos Osorno San Pablo 637.3 9150 14.4 0.625
325 14 Aisén del General Carlos Ibáñez del Campo Aysén Aysén 29796.4 27187 0.9 0.674
326 14 Aisén del General Carlos Ibáñez del Campo General Carrera Chile Chico 5737.1 5334 0.8 0.707
327 14 Aisén del General Carlos Ibáñez del Campo Aysén Cisnes 16093.0 6166 0.3 0.725
328 14 Aisén del General Carlos Ibáñez del Campo Capitán Prat Cochrane 8599.5 2759 0.3 0.668
329 14 Aisén del General Carlos Ibáñez del Campo Coyhaique Coyhaique 7290.2 59221 6.8 0.751
330 14 Aisén del General Carlos Ibáñez del Campo Aysén Guaitecas 620.6 1862 2.4 0.654
331 14 Aisén del General Carlos Ibáñez del Campo Coyhaique Lago Verde 5422.3 925 0.2 0.637
332 14 Aisén del General Carlos Ibáñez del Campo Capitán Prat O'Higgins 8182.5 700 0.1 0.572
333 14 Aisén del General Carlos Ibáñez del Campo General Carrera Río Ibáñez 5997.2 2208 0.4 0.654
334 14 Aisén del General Carlos Ibáñez del Campo Capitán Prat Tortel 19710.6 531 0.0 0.655
335 15 Magallanes y de la Antártica Chilena Antártica Chilena Antártica 1250257.6 127 0.0 NaN
336 15 Magallanes y de la Antártica Chilena Antártica Chilena Cabo de Hornos 15578.7 1677 0.1 0.806
337 15 Magallanes y de la Antártica Chilena Magallanes Laguna Blanca 3695.6 631 0.0 0.785
338 15 Magallanes y de la Antártica Chilena Última Esperanza Natales 49924.1 21327 0.4 0.699
339 15 Magallanes y de la Antártica Chilena Tierra del Fuego Porvenir 9707.4 5650 0.8 0.731
340 15 Magallanes y de la Antártica Chilena Tierra del Fuego Primavera 4253.4 803 0.2 0.774
341 15 Magallanes y de la Antártica Chilena Magallanes Punta Arenas 17846.3 125483 6.8 0.748
342 15 Magallanes y de la Antártica Chilena Magallanes Río Verde 17248.0 363 0.0 0.784
343 15 Magallanes y de la Antártica Chilena Magallanes San Gregorio 6883.7 731 0.1 0.823
344 15 Magallanes y de la Antártica Chilena Tierra del Fuego Timaukel 10758.9 873 0.1 0.717
345 15 Magallanes y de la Antártica Chilena Última Esperanza Torres del Paine 6630.0 1163 0.1 0.730

346 rows × 8 columns


In [36]:
data.sort_values('Poblacion')


Out[36]:
RegionID Region Provincia Comuna Superficie Poblacion Densidad IDH_2005
335 15 Magallanes y de la Antártica Chilena Antártica Chilena Antártica 1250257.6 127 0.0 NaN
15 3 Antofagasta El Loa Ollagüe 2964.0 332 0.1 0.679
342 15 Magallanes y de la Antártica Chilena Magallanes Río Verde 17248.0 363 0.0 0.784
334 14 Aisén del General Carlos Ibáñez del Campo Capitán Prat Tortel 19710.6 531 0.0 0.655
337 15 Magallanes y de la Antártica Chilena Magallanes Laguna Blanca 3695.6 631 0.0 0.785
1 1 Arica y Parinacota Arica Camarones 3927.0 679 0.3 0.751
332 14 Aisén del General Carlos Ibáñez del Campo Capitán Prat O'Higgins 8182.5 700 0.1 0.572
343 15 Magallanes y de la Antártica Chilena Magallanes San Gregorio 6883.7 731 0.1 0.823
2 1 Arica y Parinacota Parinacota General Lagos 2244.4 739 0.5 0.670
55 6 Valparaíso Valparaíso Juan Fernández 149.4 792 4.0 0.744
340 15 Magallanes y de la Antártica Chilena Tierra del Fuego Primavera 4253.4 803 0.2 0.774
344 15 Magallanes y de la Antártica Chilena Tierra del Fuego Timaukel 10758.9 873 0.1 0.717
331 14 Aisén del General Carlos Ibáñez del Campo Coyhaique Lago Verde 5422.3 925 0.2 0.637
5 2 Tarapacá Tamarugal Camiña 2200.2 1156 0.5 0.619
345 15 Magallanes y de la Antártica Chilena Última Esperanza Torres del Paine 6630.0 1163 0.1 0.730
17 3 Antofagasta Antofagasta Sierra Gorda 12886.0 1206 0.1 0.789
6 2 Tarapacá Tamarugal Colchane 4015.6 1384 0.4 0.603
3 1 Arica y Parinacota Parinacota Putre 5902.5 1462 0.2 0.707
311 13 Los Lagos Palena Palena 2763.7 1665 0.6 0.667
336 15 Magallanes y de la Antártica Chilena Antártica Chilena Cabo de Hornos 15578.7 1677 0.1 0.806
305 13 Los Lagos Palena Futaleufú 1280.0 1836 1.4 0.665
330 14 Aisén del General Carlos Ibáñez del Campo Aysén Guaitecas 620.6 1862 2.4 0.654
333 14 Aisén del General Carlos Ibáñez del Campo General Carrera Río Ibáñez 5997.2 2208 0.4 0.654
7 2 Tarapacá Tamarugal Huara 10474.6 2360 0.2 0.676
328 14 Aisén del General Carlos Ibáñez del Campo Capitán Prat Cochrane 8599.5 2759 0.3 0.668
159 8 Libertador General Bernardo O'Higgins Colchagua Pumanque 441.0 3458 7.8 0.635
237 10 Biobío Ñuble San Fabián 1568.3 3503 2.2 0.618
241 10 Biobío Biobío San Rosendo 92.4 3627 39.3 0.647
231 10 Biobío Biobío Quilaco 1123.7 3722 3.3 0.635
198 10 Biobío Biobío Antuco 1884.1 3774 2.0 0.662
... ... ... ... ... ... ... ... ...
70 6 Valparaíso Marga Marga Quilpué 537.0 162320 280.7 0.752
90 7 Metropolitana de Santiago Santiago El Bosque 14.2 162671 12365.0 0.711
294 12 Los Ríos Valdivia Valdivia 1016.0 163148 138.4 0.754
244 10 Biobío Concepción Talcahuano 92.3 171673 1859.9 0.731
204 10 Biobío Ñuble Chillán 511.2 175585 343.5 0.714
172 9 Maule Curicó Curicó 1328.0 177766 133.8 0.710
8 2 Tarapacá Iquique Iquique 2242.1 184953 82.4 0.766
110 7 Metropolitana de Santiago Santiago Ñuñoa 16.9 195410 8937.5 0.860
220 10 Biobío Biobío Los Ángeles 1748.2 195813 112.0 0.696
99 7 Metropolitana de Santiago Santiago La Pintana 30.6 202146 8511.8 0.679
32 5 Coquimbo Elqui Coquimbo 1429.0 202441 141.6 0.731
120 7 Metropolitana de Santiago Santiago Quilicura 58.0 203946 3486.6 0.782
0 1 Arica y Parinacota Arica Arica 4799.4 210936 38.4 0.736
35 5 Coquimbo Elqui La Serena 1892.8 211275 132.5 0.781
118 7 Metropolitana de Santiago Santiago Pudahuel 197.0 225509 993.7 0.735
209 10 Biobío Concepción Concepción 221.6 227768 1027.8 0.757
130 7 Metropolitana de Santiago Santiago Santiago 22.0 237369 8654.8 0.807
312 13 Los Lagos Llanquihue Puerto Montt 1673.0 248945 142.5 0.718
115 7 Metropolitana de Santiago Santiago Peñalolén 54.0 249621 4001.1 0.743
161 8 Libertador General Bernardo O'Higgins Cachapoal Rancagua 260.3 250638 823.4 0.732
192 9 Maule Talca Talca 232.0 264842 928.5 0.731
276 11 La Araucanía Cautín Temuco 464.0 269992 8039.0 0.763
124 7 Metropolitana de Santiago Maipo San Bernardo 155.0 277802 1974.0 0.712
102 7 Metropolitana de Santiago Santiago Las Condes 99.0 289949 2524.2 0.933
78 6 Valparaíso Valparaíso Valparaíso 401.6 308137 687.2 0.701
80 6 Valparaíso Valparaíso Viña del Mar 121.6 311399 2560.8 0.766
11 3 Antofagasta Antofagasta Antofagasta 30718.1 348669 9.7 0.734
97 7 Metropolitana de Santiago Santiago La Florida 70.2 397497 5209.0 0.804
119 7 Metropolitana de Santiago Cordillera Puente Alto 88.0 757721 6664.8 0.773
107 7 Metropolitana de Santiago Santiago Maipú 135.5 805000 3876.2 0.902

346 rows × 8 columns


In [37]:
data.sort_values('Poblacion', ascending=False)


Out[37]:
RegionID Region Provincia Comuna Superficie Poblacion Densidad IDH_2005
107 7 Metropolitana de Santiago Santiago Maipú 135.5 805000 3876.2 0.902
119 7 Metropolitana de Santiago Cordillera Puente Alto 88.0 757721 6664.8 0.773
97 7 Metropolitana de Santiago Santiago La Florida 70.2 397497 5209.0 0.804
11 3 Antofagasta Antofagasta Antofagasta 30718.1 348669 9.7 0.734
80 6 Valparaíso Valparaíso Viña del Mar 121.6 311399 2560.8 0.766
78 6 Valparaíso Valparaíso Valparaíso 401.6 308137 687.2 0.701
102 7 Metropolitana de Santiago Santiago Las Condes 99.0 289949 2524.2 0.933
124 7 Metropolitana de Santiago Maipo San Bernardo 155.0 277802 1974.0 0.712
276 11 La Araucanía Cautín Temuco 464.0 269992 8039.0 0.763
192 9 Maule Talca Talca 232.0 264842 928.5 0.731
161 8 Libertador General Bernardo O'Higgins Cachapoal Rancagua 260.3 250638 823.4 0.732
115 7 Metropolitana de Santiago Santiago Peñalolén 54.0 249621 4001.1 0.743
312 13 Los Lagos Llanquihue Puerto Montt 1673.0 248945 142.5 0.718
130 7 Metropolitana de Santiago Santiago Santiago 22.0 237369 8654.8 0.807
209 10 Biobío Concepción Concepción 221.6 227768 1027.8 0.757
118 7 Metropolitana de Santiago Santiago Pudahuel 197.0 225509 993.7 0.735
35 5 Coquimbo Elqui La Serena 1892.8 211275 132.5 0.781
0 1 Arica y Parinacota Arica Arica 4799.4 210936 38.4 0.736
120 7 Metropolitana de Santiago Santiago Quilicura 58.0 203946 3486.6 0.782
32 5 Coquimbo Elqui Coquimbo 1429.0 202441 141.6 0.731
99 7 Metropolitana de Santiago Santiago La Pintana 30.6 202146 8511.8 0.679
220 10 Biobío Biobío Los Ángeles 1748.2 195813 112.0 0.696
110 7 Metropolitana de Santiago Santiago Ñuñoa 16.9 195410 8937.5 0.860
8 2 Tarapacá Iquique Iquique 2242.1 184953 82.4 0.766
172 9 Maule Curicó Curicó 1328.0 177766 133.8 0.710
204 10 Biobío Ñuble Chillán 511.2 175585 343.5 0.714
244 10 Biobío Concepción Talcahuano 92.3 171673 1859.9 0.731
294 12 Los Ríos Valdivia Valdivia 1016.0 163148 138.4 0.754
90 7 Metropolitana de Santiago Santiago El Bosque 14.2 162671 12365.0 0.711
70 6 Valparaíso Marga Marga Quilpué 537.0 162320 280.7 0.752
... ... ... ... ... ... ... ... ...
198 10 Biobío Biobío Antuco 1884.1 3774 2.0 0.662
231 10 Biobío Biobío Quilaco 1123.7 3722 3.3 0.635
241 10 Biobío Biobío San Rosendo 92.4 3627 39.3 0.647
237 10 Biobío Ñuble San Fabián 1568.3 3503 2.2 0.618
159 8 Libertador General Bernardo O'Higgins Colchagua Pumanque 441.0 3458 7.8 0.635
328 14 Aisén del General Carlos Ibáñez del Campo Capitán Prat Cochrane 8599.5 2759 0.3 0.668
7 2 Tarapacá Tamarugal Huara 10474.6 2360 0.2 0.676
333 14 Aisén del General Carlos Ibáñez del Campo General Carrera Río Ibáñez 5997.2 2208 0.4 0.654
330 14 Aisén del General Carlos Ibáñez del Campo Aysén Guaitecas 620.6 1862 2.4 0.654
305 13 Los Lagos Palena Futaleufú 1280.0 1836 1.4 0.665
336 15 Magallanes y de la Antártica Chilena Antártica Chilena Cabo de Hornos 15578.7 1677 0.1 0.806
311 13 Los Lagos Palena Palena 2763.7 1665 0.6 0.667
3 1 Arica y Parinacota Parinacota Putre 5902.5 1462 0.2 0.707
6 2 Tarapacá Tamarugal Colchane 4015.6 1384 0.4 0.603
17 3 Antofagasta Antofagasta Sierra Gorda 12886.0 1206 0.1 0.789
345 15 Magallanes y de la Antártica Chilena Última Esperanza Torres del Paine 6630.0 1163 0.1 0.730
5 2 Tarapacá Tamarugal Camiña 2200.2 1156 0.5 0.619
331 14 Aisén del General Carlos Ibáñez del Campo Coyhaique Lago Verde 5422.3 925 0.2 0.637
344 15 Magallanes y de la Antártica Chilena Tierra del Fuego Timaukel 10758.9 873 0.1 0.717
340 15 Magallanes y de la Antártica Chilena Tierra del Fuego Primavera 4253.4 803 0.2 0.774
55 6 Valparaíso Valparaíso Juan Fernández 149.4 792 4.0 0.744
2 1 Arica y Parinacota Parinacota General Lagos 2244.4 739 0.5 0.670
343 15 Magallanes y de la Antártica Chilena Magallanes San Gregorio 6883.7 731 0.1 0.823
332 14 Aisén del General Carlos Ibáñez del Campo Capitán Prat O'Higgins 8182.5 700 0.1 0.572
1 1 Arica y Parinacota Arica Camarones 3927.0 679 0.3 0.751
337 15 Magallanes y de la Antártica Chilena Magallanes Laguna Blanca 3695.6 631 0.0 0.785
334 14 Aisén del General Carlos Ibáñez del Campo Capitán Prat Tortel 19710.6 531 0.0 0.655
342 15 Magallanes y de la Antártica Chilena Magallanes Río Verde 17248.0 363 0.0 0.784
15 3 Antofagasta El Loa Ollagüe 2964.0 332 0.1 0.679
335 15 Magallanes y de la Antártica Chilena Antártica Chilena Antártica 1250257.6 127 0.0 NaN

346 rows × 8 columns
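When only the extremes are of interest, sorting the whole table and scrolling is not necessary. As a sketch on a hypothetical four-row excerpt, `nlargest` returns the top rows directly and matches a descending sort followed by `head`:

```python
import pandas as pd

# Four-row excerpt with values taken from the table above
data = pd.DataFrame({
    'Comuna': ['Maipú', 'Puente Alto', 'La Florida', 'Antártica'],
    'Poblacion': [805000, 757721, 397497, 127],
})

# The two most populated comunas
top2 = data.nlargest(2, 'Poblacion')

# Equivalent formulation with sort_values
same = data.sort_values('Poblacion', ascending=False).head(2)

print(top2)
```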


In [40]:
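# Select several columns with a list (double brackets); the bare
# ['a', 'b'] form is rejected by recent Pandas versions
data.groupby(['Region'])[['Poblacion', 'Superficie']].sum()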


Out[40]:
                                           Poblacion  Superficie
Region
Aisén del General Carlos Ibáñez del Campo     106893   107449.40
Antofagasta                                   551627   126048.80
Arica y Parinacota                            308271    17446.20
Atacama                                       292054    74806.30
Biobío                                       2025995    37068.70
Coquimbo                                      714856    40967.80
La Araucanía                                  933537    31842.30
Libertador General Bernardo O'Higgins         903248    16583.30
Los Lagos                                     835829    48583.30
Los Ríos                                      380618    18577.60
Magallanes y de la Antártica Chilena          158828  1392783.70
Maule                                        1073635    30340.30
Metropolitana de Santiago                    7150480    15547.00
Tarapacá                                      205566    41632.60
Valparaíso                                   1859312    12646.28

In [48]:
# Aggregate per region, then order the regions by total population
data.groupby(['Region'])[['Poblacion', 'Superficie']].sum().sort_values('Poblacion', ascending=False)


Out[48]:
                                           Poblacion  Superficie
Region
Metropolitana de Santiago                    7150480    15547.00
Biobío                                       2025995    37068.70
Valparaíso                                   1859312    12646.28
Maule                                        1073635    30340.30
La Araucanía                                  933537    31842.30
Libertador General Bernardo O'Higgins         903248    16583.30
Los Lagos                                     835829    48583.30
Coquimbo                                      714856    40967.80
Antofagasta                                   551627   126048.80
Los Ríos                                      380618    18577.60
Arica y Parinacota                            308271    17446.20
Atacama                                       292054    74806.30
Tarapacá                                      205566    41632.60
Magallanes y de la Antártica Chilena          158828  1392783.70
Aisén del General Carlos Ibáñez del Campo     106893   107449.40

In [49]:
# Grouping by RegionID as well keeps the regions in north-to-south ID order
data.sort_values(['RegionID']).groupby(['RegionID', 'Region'])[['Poblacion', 'Superficie']].sum()


Out[49]:
                                                    Poblacion  Superficie
RegionID Region
1        Arica y Parinacota                            308271    17446.20
2        Tarapacá                                      205566    41632.60
3        Antofagasta                                   551627   126048.80
4        Atacama                                       292054    74806.30
5        Coquimbo                                      714856    40967.80
6        Valparaíso                                   1859312    12646.28
7        Metropolitana de Santiago                    7150480    15547.00
8        Libertador General Bernardo O'Higgins         903248    16583.30
9        Maule                                        1073635    30340.30
10       Biobío                                       2025995    37068.70
11       La Araucanía                                  933537    31842.30
12       Los Ríos                                      380618    18577.60
13       Los Lagos                                     835829    48583.30
14       Aisén del General Carlos Ibáñez del Campo     106893   107449.40
15       Magallanes y de la Antártica Chilena          158828  1392783.70
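The same per-region totals can also be written as a pivot table. The sketch below uses a hypothetical three-row excerpt of the dataset and compares `pd.pivot_table` with the groupby form; the names `by_groupby` and `by_pivot` are illustrative.

```python
import pandas as pd

# Three-row excerpt with values taken from the full table
data = pd.DataFrame({
    'Region': ['Tarapacá', 'Tarapacá', 'Atacama'],
    'Comuna': ['Iquique', 'Pica', 'Caldera'],
    'Poblacion': [184953, 4194, 16150],
    'Superficie': [2242.1, 8934.3, 4666.6],
})

# groupby form, as used in the cells above
by_groupby = data.groupby('Region')[['Poblacion', 'Superficie']].sum()

# pivot_table form: rows come from `index`, aggregated columns from `values`
by_pivot = pd.pivot_table(data, index='Region',
                          values=['Poblacion', 'Superficie'],
                          aggfunc='sum')

print(by_pivot)
```

For a single grouping key the two forms give the same table; `pivot_table` becomes more convenient once a second key is spread across the columns with its `columns` argument.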
